1  Stationary Time Series

What is a Stationary Time Series?

The R code below simulates a time series with a constant mean and variance (i.e., a stationary time series).

# Generate a Stationary Time Series
# Ensure your results are reproducible
set.seed(123)  

# Simulate 120 data points
stationary_ts <- ts(rnorm(120, mean = 50, sd = 5), frequency = 12)  

# Create a tsibble for it
stationary_df <- tsibble::as_tsibble(stationary_ts)
stationary_df <- dplyr::mutate(stationary_df, t = dplyr::row_number())

# Normally distributed, mean of 50, variance of 25  
plot(stationary_ts) # Visualize the series 

1.0.1 Autocorrelation and Partial Autocorrelation Function

This section generates Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF) plots of the time series data, using the acf2 function from the astsa package.

  • No Systematic Trends: Think of it as your data hovering around a roughly constant average value over time. No clear upward or downward drifts.

  • Variance is Consistent: Your data doesn’t get systematically more ‘spread out’ as time goes on.

  • Why It Matters: Lots of time series analysis methods, including ARIMA modeling, often perform best when your data is stationary (or you’ve made transformations to achieve this).
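To see the contrast, here is a minimal sketch (an illustration, not part of the analysis above): a random walk violates both conditions, since its variance grows over time, and first-differencing is one common transformation that restores stationarity.

```r
# Contrast: a random walk is NOT stationary -- its variance grows with time
set.seed(123)
random_walk <- ts(cumsum(rnorm(120)), frequency = 12)

# First-differencing recovers the underlying white noise, which is stationary
differenced <- diff(random_walk)

# Side-by-side visual comparison
par(mfrow = c(2, 1))
plot(random_walk, main = "Random Walk (non-stationary)")
plot(differenced, main = "First Difference (stationary)")
```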

Let’s take a look at the ACF and PACF plots of this series to identify a suitable model.

# ACF and PACF plots for the stationary time series 
acf_stationary <- astsa::acf2(stationary_ts, main = "ACF of Stationary Time Series") 

Interpreting the ACF Plot (acf_stationary)

  1. Significant Spikes: Are there tall bars exceeding the blue dashed lines early on? This implies correlation at specific lags (e.g., today is similar to yesterday).

  2. Decaying Pattern: Does the spike height drop rapidly, with most bars within the dashed lines after a few lags? This is common for stationary series, as further ‘echoes’ in time get fainter.

  3. No Obvious Seasonality: If data had monthly cycles, your ACF would reflect it with repeated spikes every 12 lags. You likely won’t see this here.
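For comparison, a short sketch (a hypothetical series, not the data above) shows the repeated-spike pattern that a monthly cycle would produce in the ACF:

```r
# Hypothetical seasonal series: a 12-period sine cycle plus noise
set.seed(123)
time_idx <- 1:120
seasonal_x <- 10 * sin(2 * pi * time_idx / 12) + rnorm(120)

# Repeated spikes at lags 12, 24, 36 reveal the seasonality
acf(seasonal_x, lag.max = 36, main = "ACF of a Seasonal Series")
```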

What about the PACF?

  • The PACF would likely not show many major spikes beyond the first couple. That tells us, once you account for very short-term correlation, your ‘echoes’ mostly disappear!
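To check this numerically rather than by eye, a small sketch (re-simulating the same series as above) pulls the PACF values and counts how many exceed the approximate 95% significance band of ±1.96/√n:

```r
# Re-simulate the stationary series from earlier
set.seed(123)
x <- rnorm(120, mean = 50, sd = 5)

# Partial autocorrelations at lags 1..20 and the approximate 95% bound
p <- pacf(x, lag.max = 20, plot = FALSE)$acf
bound <- 1.96 / sqrt(length(x))

# How many partial autocorrelations look 'significant'?
sum(abs(p) > bound)
```

For a truly white-noise-like series, we expect this count to be at or near zero (roughly 1 in 20 lags can cross the band by chance alone).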

1.0.2 Statistical Tests for Autocorrelation

Other common methods are the Ljung-Box test and the Durbin-Watson test. Both test the null hypothesis that there is no autocorrelation in the data. If the p-value of the test is less than a chosen significance level (e.g., 0.05), then we can reject the null hypothesis and conclude that there is autocorrelation in the data.

  • Durbin-Watson Test (lmtest::dwtest): A formal statistical test for first-order (lag-1) autocorrelation in the residuals of a regression model.

  • Ljung-Box Test (Box.test): The Ljung-Box test checks for autocorrelation in the residuals of a time series model. Autocorrelation here means that the residuals (errors) of the model are correlated with each other at different lags.

Null Hypothesis:

  • H0: The data are independently distributed (i.e. the correlations in the population from which the sample is taken are 0, so that any observed correlations in the data result from randomness of the sampling process).
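To make the test statistic concrete, here is a sketch (assuming simulated white noise of length 120, not the series above) that computes the Ljung-Box statistic by hand and compares it with the built-in Box.test:

```r
set.seed(123)
x <- rnorm(120)
n <- length(x)
h <- 10  # number of lags tested

# Sample autocorrelations at lags 1..h
rho <- acf(x, lag.max = h, plot = FALSE)$acf[-1]

# Ljung-Box statistic: Q = n(n+2) * sum(rho_k^2 / (n - k)), chi-squared with h df
Q <- n * (n + 2) * sum(rho^2 / (n - seq_len(h)))
p_value <- pchisq(Q, df = h, lower.tail = FALSE)

# The built-in test should give the same statistic and p-value
Box.test(x, lag = h, type = "Ljung-Box")
```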

Here are some steps you can take to check for autocorrelation in your time series data:

  1. Fit a linear regression model to your data.

    # Simple Linear Regression Model 
    stationary_simple <- stats::lm(value ~ t, data = stationary_df)  
    # View Model Summary 
    summary(stationary_simple)
    
    Call:
    stats::lm(formula = value ~ t, data = stationary_df)
    
    Residuals:
         Min       1Q   Median       3Q      Max 
    -11.5272  -2.8354  -0.2433   2.9391  11.1638 
    
    Coefficients:
                 Estimate Std. Error t value            Pr(>|t|)    
    (Intercept) 50.581625   0.823364  61.433 <0.0000000000000002 ***
    t           -0.008337   0.011810  -0.706               0.482    
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    
    Residual standard error: 4.482 on 118 degrees of freedom
    Multiple R-squared:  0.004206,  Adjusted R-squared:  -0.004233 
    F-statistic: 0.4984 on 1 and 118 DF,  p-value: 0.4816
  2. Calculate the residuals from the model.

    stationary_simple_residuals <- resid(stationary_simple)  
    ## Visual Check for Patterns 
    plot(stationary_simple_residuals, main = "Residual Plot") 
  3. Plot the autocorrelation function (ACF) of the residuals.

    ## Autocorrelation Check   
    astsa::acf2(resid(stationary_simple), main = "Autocorrelation Function (ACF) of Residuals")

         [,1]  [,2] [,3]  [,4] [,5] [,6] [,7]  [,8]  [,9] [,10] [,11] [,12] [,13]
    ACF  0.01 -0.08 0.12 -0.08 0.04 0.04 0.02 -0.02 -0.08 -0.08  0.06 -0.12 -0.11
    PACF 0.01 -0.08 0.12 -0.09 0.06 0.01 0.05 -0.04 -0.07 -0.10  0.06 -0.14 -0.09
         [,14] [,15] [,16] [,17] [,18] [,19] [,20] [,21]
    ACF   0.13 -0.08 -0.09  0.12 -0.08 -0.02 -0.06  0.01
    PACF  0.09 -0.05 -0.07  0.09 -0.07  0.01 -0.11  0.04
  4. Perform a Durbin-Watson test on the residuals.

## Durbin-Watson Test   
lmtest::dwtest(stationary_df$value ~ stationary_df$t)

    Durbin-Watson test

data:  stationary_df$value ~ stationary_df$t
DW = 1.9701, p-value = 0.3979
alternative hypothesis: true autocorrelation is greater than 0
  5. Perform a Box-Ljung test on the residuals.
# Box-Ljung test 
Box.test(stationary_simple$residuals, lag = 24, type = "Ljung-Box")

    Box-Ljung test

data:  stationary_simple$residuals
X-squared = 20.721, df = 24, p-value = 0.6551
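The Durbin-Watson statistic reported earlier can also be computed by hand. The sketch below rebuilds the same regression (same seed and simulation as above) and verifies the rule of thumb DW ≈ 2(1 − r1), so a value near 2 means the lag-1 autocorrelation of the residuals is near zero:

```r
# Rebuild the same simulated data and regression as earlier
set.seed(123)
y <- rnorm(120, mean = 50, sd = 5)
e <- resid(lm(y ~ seq_len(120)))

# Durbin-Watson statistic by hand: DW = sum((e_t - e_{t-1})^2) / sum(e_t^2)
DW <- sum(diff(e)^2) / sum(e^2)

# Rule of thumb: DW is approximately 2 * (1 - r1),
# where r1 is the lag-1 autocorrelation of the residuals
r1 <- acf(e, lag.max = 1, plot = FALSE)$acf[2]
c(DW = DW, approx = 2 * (1 - r1))
```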